Adding new type 'MV256W' and enabling AArch64 STP SIMD semantics. #636
base: master
… involving SIMD registers.
I think the idea of the 64-bit loads/stores was that I didn't have a reliable uint128_t type years ago, and I didn't want to risk it turning into an intrinsic function that took the address of a 128-bit value, and thus took the address of a state structure register, indirectly leading to state escape that could hinder optimization. You could prototype an actual uint128_t (or beyond) memory access intrinsic and see if you get better code. Similarly, a very early decision in remill was that I wanted to make the code as easy as possible to run in KLEE or custom tools, so I didn't want to have to support the myriad vector types, and hence used arrays for everything. This might be worth revisiting, e.g. using |
I think the 64-bit load/store instructions are actually fine for now, because they match the logic used by the rest of the semantics (e.g. the LDP SIMD semantics) and are easier to work with in KLEE/custom tools. That said, the AArch64 manual specifically shows this instruction executing two 128-bit loads/stores, which is why I had a doubt. Changing this would be a long process that makes sure all the instructions that work on SIMD registers and memory use properly sized memory accesses. |
@fvrmatteo Can you investigate re-enabling this test (and possibly see if it's sane)? https://github.com/lifting-bits/remill/blob/master/tests/AArch64/DATAXFER/STP_n_LDSTPAIR_OFF.S#L67 |
The test itself looks fine and I re-enabled it, although I cannot really run it on M1. |
@fvrmatteo can you investigate the hacks that @tetsuo-cpp tried, and see if it's possible to get M1 support by removing the call frame information dwarf things, i.e. |
…g 'brk #0x0' as AArch64 breakpoint. Using @page and @PAGEOFF.
@pgoodman I made some changes to the AArch64 tests, but I'm wandering in the dark a bit because I don't know how GTest works. The changes in my last commit enable the compilation of the AArch64 tests on macOS ARM64, although when I run them manually I get the following error:
It is unexpected because in the I also turned |
@fvrmatteo Does Clang's assembler support the Otherwise, I am not sure why that comes up with the tests. I will try to look into it. |
@pgoodman As far as I see and from my tests, the Please let me know if you figure out the issue with GTest, in the meantime I'll try figuring out the same. |
Hey @fvrmatteo, I'm taking a look. I'll let you know what I find. |
Ok, I have an idea of what's going on. The reason this is happening is not because there is no Curiously, compilation succeeds without an issue; however, if you dump the contents of the I believe this is because the pre-processor macros that we use to populate the test table aren't portable across assemblers: more specifically, the GAS syntax uses At the moment, I'm trying to figure out whether there is a character that isn't a newline that can be used as a statement delimiter in the Clang ARM assembler syntax or whether there is some |
@tetsuo-cpp In the past, I've copied what DynamoRIO does and used |
Haha it's ugly... but it works. Done in ae8f8c9. Now I'm seeing a few test failures, but at least the tests are running. The weird thing is that the number of failing tests is different each time, so there's some flakiness going on. I don't have time to look at it tonight though. |
Had a quick look at this today but I'm still not 100% sure. From what I can see, there are two different types of mismatches that may or may not be related: The first is a mismatch on system registers. Flags like IXC, OFC and UFC are seeing mismatches. This happens across multiple test cases.
The other is just an inequality when comparing states byte for byte. So it's not clear what part of the state is different unless I sit down and count the size of each struct member.
The other part that's stumping me is the fact that the number of test failures isn't consistent. As far as I can tell, these tests should be deterministic but I haven't looked at every LOC so I don't know for sure. @pgoodman, do you know of any reason why the tests could be failing non-deterministically? |
@pgoodman I'll take a look at this again when I get the chance. Did you have any ideas on why we might have non-determinism in these tests? |
These look like floating point stuff? FPU stuff is typically a bit sketchy and might behave differently in different environments / rounding modes / etc. These one-byte differences look like FPU conditions, so probably the way we're computing the conditions is not quite right, or it appears right but the actual non-determinism is due to the bitcode/machine code not being exactly like we need it to be. The way to track flags of FPU operations ends up being pretty brittle: it's something like:
And we rely on these being in-order, with no other stuff sneaking in. But maybe in the code there is:
And then the compiler (un)helpfully reorders as:
|
Hi!
I wanted to take a stab at adding support for the AArch64 STP_Q_LDSTPAIR_OFF semantics.
I prepared a preliminary patch, although there are some things that are not clear to me at the moment: